System and method for adjusting multiple resources across multiple workloads

ABSTRACT

Increased workload performance is obtained by coordinating a multi-resource computer system such that demands for resources are arbitrated across all available resources and all applications such that the proper resource will be adjusted regardless of which resource is needed to improve workload performance. In operation, a measurement is taken for each available resource to determine the enhancement achieved by adding a certain quantity of a resource. In one embodiment, resource consumption and performance data is collected over a period of time and that data is used to adjust resource requests for a workload in order to improve the workload's performance. The resource request is modified to deliver the most workload benefit for each resource modification.

FIELD OF THE INVENTION

This disclosure relates to computer systems and more particularly to systems and methods for computer workload management.

DESCRIPTION OF RELATED ART

Currently, computer goal-based workload management systems operate to adjust the CPU in response to an arbitrary measure of performance for any arbitrary workload. The key problem with this is that if a workload is not CPU intensive, the adjustment of CPU may not improve the performance of the workload.

One option is to simply use resource utilization to adjust multiple resources. One problem with this approach is that it may waste resources because some applications may receive performance that far exceeds the requirements for the application. This problem is compounded in that workloads may react differently to the availability of different resources and an adjustment solution must work for any arbitrary workload and it must work for any measure of performance for that workload.

Another issue is that a workload's performance may be impacted by resource contention caused by other workloads. Such contention can cause resource requirements to vary over time based on what the application is doing at the time and on the other applications that are running on the system at that time and what stage such application is in.

In some arrangements, a computer system workload is affected by the amount and type of resources that are available to the workload at any particular time. Thus, when a workload is underperforming it is desirable to adjust the resources that are available to it.

Current systems address a single resource and, hence, require separate resource allocation policies for each computer system resource that can be adjusted. These “single” resource management systems add complexity to defining a resource allocation policy for workload management systems.

Workload management is the approach of adjusting resource entitlements (such as the number of CPUs, the amount of memory, etc.) to workloads based on workload performance data. When multiple resources are being adjusted it is difficult to determine which resource to adjust to achieve optimum results. It is also difficult to know how much a given resource change will improve the performance of the workload.

As an example, if a system is measuring the response time of a workload and it has the ability to adjust the entitlements of, for example, CPU, memory, disk I/O bandwidth or network bandwidth, how does it know which of these should be adjusted to improve the response time of the workload?

BRIEF SUMMARY OF THE INVENTION

There are disclosed systems and methods for coordinating a multi-resource computer system such that demands for resources are arbitrated across all available resources and all applications such that the proper resource will be adjusted to increase the proper workload performance regardless of which resource is needed to improve workload performance. In one embodiment, the system tracks performance data across all resources so that the system knows for all resources what to expect from a resource adjustment at any point in time. Using the system and methods disclosed, any desired resource adjustment is tempered to ensure that maximum benefit is derived from such an adjustment. Arbitration is used to mediate between competing resource requests.

In one embodiment, resource allocation vectors are used to determine allocation of resources that will improve a workload's performance. In operation, a measurement is taken for each available resource to determine the enhancement achieved by adding a certain quantity of a resource. In this manner a historical profile is created for a point in time dependent upon the workload's actual response at that time to changes in resource availability. When the performance of a workload requires enhancing by the adjustment of a resource, the historical profile is used as a vector by the workload policy controller to adjust resources to achieve the desired enhanced performance.

In one embodiment, resource consumption and performance data is collected over a period of time and that data is used to adjust resource requests for a workload in order to improve the workload's performance. The resource request is modified to deliver the most workload benefit for each resource modification.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present invention, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:

FIG. 1 shows one embodiment of a system having multiple resources available to a plurality of workloads;

FIG. 2 shows one embodiment of a system for adjusting multiple resources for multiple applications;

FIG. 3 shows one embodiment of a computer system having multiple resource capabilities; and

FIG. 4 shows one embodiment of a process for controlling workload resource allocation.

DETAILED DESCRIPTION

FIG. 1 shows one embodiment 10 of a multi-resource (11-1 to 11-N) computer system serving workloads (applications) 12-1 to 12-N. The resources are managed by workload management (WLM) tools 13 and 14, working from input from adjust resource request 25-1 (FIG. 2). Each WLM adjusts the amount of each resource required by application 1 or by any other application. FIG. 1 shows two resources, 11-1 and 11-N, which typically would be memory and CPU, but could be any resource(s), such as bandwidth, network, I/O bandwidth, kernel data structure space, process table entries, etc.

WLM tools 13 and 14 are most likely a single instance of WLM and, as will be seen, operate to change the partitions 15-1 to 15-4 for each resource for each application as necessary.

The structure shown in FIG. 2 is one embodiment of a system for adjusting multiple resources for a single application.

FIG. 2 shows one embodiment 20 of a system and method for adjusting multiple resources 11-1 to 11-N for multiple applications 12-1 to 12-N. Embodiment 20 can, if desired, be part of WLM 13, 14 or could be stand-alone or part of a controller. The discussion of FIG. 2 addresses only a single resource for each application, but multiple resources are considered for each application.

For discussion purposes, let us assume that resource 11-1 is memory monitored by WLM tool 13 (FIG. 1) and that resource 11-N is CPU monitored by WLM tool 14 (FIG. 1). The process starts with the collection of performance metrics by process 21-1 for application 1 as that application is running on the system. The collected data is then compared to the determined resource requirements by process 22-1 operating in conjunction with application 1 resource consumption profile 24-1. Consumption profile 24-1 operates to add the proper measure of the resource, as calculated, to gain the most performance from the workload (application).

Resource requirements can be based on a basic or sophisticated profile-based controller algorithm. The resources that a workload (application) has available to it depend upon the workload's utilization of those resources. For example, if the workload is entitled to use 60 shares of CPU and is using 40, that is approximately 67% utilization of CPU. If the same workload has access to 2 Gigabytes of memory (not shown) and is using 1.5 Gigabytes, that is 75% utilization. If it has access to 2 Gigabytes/sec of network bandwidth and is using 1.9 Gigabytes/sec, that is 95% utilization. A resource manager seeing that utilization is over a certain threshold level might then call for additional resources.
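The utilization figures in the preceding example can be reproduced with a short sketch. The sketch below is illustrative only: the function name, the dictionary keys and the 90% threshold are assumptions introduced here, since the description does not fix a particular threshold value.

```python
def utilization(used: float, entitled: float) -> float:
    """Return the fraction of an entitlement that the workload is consuming."""
    if entitled <= 0:
        raise ValueError("entitlement must be positive")
    return used / entitled


# (used, entitled) pairs taken from the example in the description.
usage = {
    "cpu_shares": (40, 60),
    "memory_gb": (1.5, 2.0),
    "network_gb_per_s": (1.9, 2.0),
}

# Hypothetical trigger level; the description only says "a certain threshold".
THRESHOLD = 0.90

utilizations = {name: utilization(u, e) for name, (u, e) in usage.items()}

# Resources for which a purely utilization-driven manager would ask for more.
over_threshold = [name for name, value in utilizations.items() if value > THRESHOLD]
```

With these figures only the network bandwidth entry exceeds the assumed threshold, which is the case a utilization-only resource manager would act upon.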

It would be easy to look at these resource requests and assume that because network bandwidth is at 95% utilization the problem is with network capacity. This may or may not be a factor in slower than expected workload processing. By calculating individual resource pressure, the system can, based upon a knowledge of how each resource impacts workload performance, adjust a resource request based on the likelihood that the requested resource will actually help improve the performance of the workload. As an example, memory can be at 95% utilization. Adding memory will have no impact on performance since the workload's total data is already in memory. This is in contrast to CPU utilization rising above 80%, which starts impacting performance because process context switching is performed continuously. The time for such processing becomes excessive when CPU utilization goes above 80%.

Based upon the input from process 22-1 and input from process 23-1, which collects the actual resource consumption by application 1, process 25-1 issues commands for adjusting resource requests. These commands are sent to the proper WLM tool (in this case tool 13) to change the partition (15-1, FIG. 1) for resource 11-1. Process 25-1 can, if desired, examine resource performance patterns which reflect knowledge about how a particular resource impacts the performance of a workload. This knowledge could be put into the system by a system user, but most typically would be gathered over time and stored, for example, in memory 15 (FIG. 1). The purpose of this operation is to understand how adding (or removing) resources helps, does nothing or possibly hinders performance. Thus, as discussed above, a request for additional memory may not be the solution to a performance problem even if the memory is at 95% utilization.
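One way the adjustment of a request against a stored performance pattern might look is sketched below. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function `adjust_requests`, the `observed_gain` table (performance improvement per unit of resource) and the `min_gain` cutoff are hypothetical names and values introduced for the example.

```python
from typing import Dict


def adjust_requests(
    raw_requests: Dict[str, float],
    observed_gain: Dict[str, float],
    min_gain: float = 0.0,
) -> Dict[str, float]:
    """Keep a request only when the stored pattern says the resource helps.

    raw_requests:  amount of each resource asked for on utilization alone.
    observed_gain: assumed history of performance gain per added unit.
    min_gain:      hypothetical cutoff below which a request is suppressed.
    """
    adjusted = {}
    for resource, amount in raw_requests.items():
        gain = observed_gain.get(resource, 0.0)
        # A resource with no recorded benefit gets nothing, even at high utilization.
        adjusted[resource] = amount if gain > min_gain else 0.0
    return adjusted


# Memory is heavily utilized but adding it has not helped; CPU has helped.
raw = {"memory_gb": 0.5, "cpu_shares": 10.0}
pattern = {"memory_gb": 0.0, "cpu_shares": 0.4}
adjusted = adjust_requests(raw, pattern)
```

In this example the memory request is suppressed and the CPU request passes through unchanged, mirroring the case where memory at 95% utilization is not the cause of the performance problem.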

Process 25-1 adjusts the resource requests (for all resources for application 1) based on the utilization of the same resource in the prior interval and the pattern of how these resources impact performance and sends these requests to resource arbiter 26. This can be done serially on a resource by resource basis, or all at one time, as desired.

Thus, the process for resource 11-N (and for any other resource) is the same as for resource 11-1, except performed by processes 21-N through 25-N. Note that while the processes for resources 11-N are shown separately from the processes for resource 11-1, they, in fact, could be the same. Also note that while separate processes 21-1 to 25-1 are shown, they could also be a single process or any combination thereof.

The process for application 12-N (and for any other application) is the same as for application 12-1, and is performed by processes 21-N through 25-N, such that adjust resource request 25-N sends requests for all needed resources (with respect to application N-1) to arbiter 26. Arbiter 26 then determines the mediation between resources and between applications to maximize the overall system operation. Arbiter 26 can work on all resources or on one at a time, as desired.
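The description does not specify the policy used by arbiter 26. Purely as an illustration, the sketch below assumes a proportional-share policy for a single resource: requests are granted in full when they fit within capacity and are scaled down proportionally otherwise. The function name `arbitrate` and the policy itself are assumptions, not the disclosed arbitration method.

```python
from typing import Dict


def arbitrate(capacity: float, requests: Dict[str, float]) -> Dict[str, float]:
    """Grant per-workload requests for one resource without exceeding capacity.

    Assumed policy: proportional scaling when the resource is oversubscribed.
    """
    total = sum(requests.values())
    if total <= capacity:
        # Everything fits, so every workload receives what it asked for.
        return dict(requests)
    scale = capacity / total
    return {workload: amount * scale for workload, amount in requests.items()}


# Two applications compete for 100 units of one resource.
grants = arbitrate(100.0, {"app1": 80.0, "app2": 40.0})
```

An arbiter that works on all resources at once could apply the same routine once per resource, or replace it with a policy that weighs workload priorities.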

Note that while the processes for applications 12-N are shown separately from the processes for applications 12-1, they, in fact, could be the same and used serially. Also note that while separate processes 21-1 to 25-1 are shown, they could also be a single process or any combination thereof.

FIG. 3 shows one embodiment 30 of computer system 310 having multiple resource capabilities, which resources can be used as needed to increase (or decrease) a workload's performance. In the embodiment of FIG. 3, workload 31-1 can use CPUs 34-1 to 34-N and memory 35-1 and I/O bandwidth 36-1 to 36-N. A particular workload, such as workload 31, typically has a single dimensional value (e.g., database transaction time) that is used to monitor performance. However, the workload's performance is a function of the allocation of multiple computer system resources (e.g., CPU, memory, I/O, etc.). The response to increasing one resource over another resource may be dramatically different. For example, some applications may not benefit at all from an increase in CPU resources, but instead may improve dramatically with increases in, say, memory allocation.

Workload manager (WLM) 48, working with policy objects 47-1 to 47-N, controls the resource allocation in conjunction with process 40, as will be discussed with respect to FIG. 4.

FIG. 4 shows one embodiment 40 of a process for controlling workload resource allocations such that process 401 in conjunction with WLM determines that a particular workload performance needs improvement.

In process 402 the WLM determines the proportion (scalar) of the current allocation to reduce or add, using a previously calculated resource allocation vector (as will be discussed hereinafter). Process 403 calculates the workload allocation to equal the old allocation plus the proportion (resource allocation vector) to reduce or add the needed resources. Process 404 changes the resource allocation for the workload under control of the WLM.
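The calculation attributed to process 403 can be read as adding a scaled copy of the resource allocation vector to the current allocation. The sketch below follows that reading; the function name `apply_allocation_vector`, the resource keys and the sample numbers are assumptions chosen for illustration.

```python
from typing import Dict


def apply_allocation_vector(
    current: Dict[str, float],
    vector: Dict[str, float],
    proportion: float,
) -> Dict[str, float]:
    """Return the new allocation: old allocation plus proportion times vector.

    A positive proportion adds resources and a negative one removes them.
    Resources missing from the vector are left unchanged.
    """
    return {
        resource: amount + proportion * vector.get(resource, 0.0)
        for resource, amount in current.items()
    }


current = {"cpu": 4.0, "memory_gb": 8.0}
vector = {"cpu": 0.5, "memory_gb": 2.0}   # assumed, previously calculated
new_allocation = apply_allocation_vector(current, vector, proportion=2.0)
```

Process 404 would then hand the resulting allocation to the WLM, which is outside the scope of this sketch.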

Note that, as discussed above, processes 401 through 404 operate on the assumption that an allocation vector has previously been established for the next change to occur. If it is time for reestablishing an updated resource allocation vector, process 405 initiates process 407, which determines the resource type, i.e., CPU, memory, input/output, bandwidth, etc.

Process 408 removes the target resource to be updated from the list of resource types available. Process 409 changes the allocation of the resource by delta units. Process 410 takes a measurement of the improvement in the workload performance; this is the improvement.

Process 411 then normalizes the scalar by determining that the component equals the delta divided by the improvement. If the improvement is zero, then the component equals the minimum increment for this resource. This means that if there has been no improvement by increasing the resource there is no need to continue to change the resource.
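The normalization rule of process 411 is small enough to state directly in code. The sketch below is one reading of it; the function name and argument names are assumptions. The description does not say what should happen when performance gets worse, so a negative improvement is simply passed through the same division here.

```python
def normalize_component(delta: float, improvement: float, min_increment: float) -> float:
    """Units of resource per unit of measured improvement (process 411).

    When the trial change produced no improvement, fall back to the minimum
    increment for the resource, as the description specifies.
    """
    if improvement == 0:
        return min_increment
    return delta / improvement


# Adding 2 units improved the performance metric by 4: 0.5 unit per unit of gain.
helpful = normalize_component(delta=2.0, improvement=4.0, min_increment=1.0)

# Adding 2 units changed nothing: the component is the minimum increment.
unhelpful = normalize_component(delta=2.0, improvement=0.0, min_increment=1.0)
```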

Process 407 then begins a process of iterations such that a different resource component is tested and, if there are more resource types remaining to be tested, processes 408, 409, 410 and 411 continue. When all resource types have been tested, process 414 updates each resource allocation factor in the workload policy controller, which is part of the WLM.
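The iteration over resource types can be sketched as a loop that perturbs one resource at a time and records the resulting component. The sketch is illustrative only: `build_allocation_vector`, the `measure` callback and the toy performance model are assumptions, and a real system would measure a live workload rather than call a function. Each trial is made on a copy of the allocation, so the change is undone before the next resource is tested.

```python
from typing import Callable, Dict


def build_allocation_vector(
    allocation: Dict[str, float],
    deltas: Dict[str, float],
    min_increments: Dict[str, float],
    measure: Callable[[Dict[str, float]], float],
) -> Dict[str, float]:
    """Test each resource in turn and return one component per resource."""
    vector: Dict[str, float] = {}
    remaining = list(allocation)                 # resource types still to test
    while remaining:
        resource = remaining.pop(0)              # process 408: take next type
        baseline = measure(allocation)
        trial = dict(allocation)
        trial[resource] += deltas[resource]      # process 409: change by delta
        improvement = measure(trial) - baseline  # process 410: measure the gain
        if improvement == 0:                     # process 411: normalize
            vector[resource] = min_increments[resource]
        else:
            vector[resource] = deltas[resource] / improvement
    return vector                                # process 414 would store this


def toy_performance(alloc: Dict[str, float]) -> float:
    """Hypothetical CPU-bound workload: only CPU affects the metric."""
    return 10.0 * alloc["cpu"]


vector = build_allocation_vector(
    allocation={"cpu": 2.0, "memory_gb": 4.0},
    deltas={"cpu": 1.0, "memory_gb": 1.0},
    min_increments={"cpu": 1.0, "memory_gb": 0.25},
    measure=toy_performance,
)
```

For this toy workload the CPU component reflects a measured gain, while the memory component falls back to its minimum increment because adding memory changed nothing.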

Note that with processes 407, 408, 409, 410, 411 and 412, each resource in turn is tested to determine what effect a change in that resource will have on the operation (performance) of the workload. Subsequently, this resource allocation vector is used in process 403. This is done after process 402 in which WLM determines the magnitude (proportion) of the change in resource allocation needed by the workload. These settings are then maintained in WLM and used in processes 402, 403 and 404 to set the resource to the proper level when an adjustment in resources is necessary. Thus, in process 401 when the determination is made that a workload performance needs improvement, process 402 looks in its allocation of scalars and determines which scalars to apply to which resources and the resource is adjusted. From prior iterations it is known that a certain adjustment will result in a certain increase, so when a resource is added it is highly likely that performance will be enhanced. Process 406 continues monitoring resource allocation while no updates are taking place.

As discussed above, the initial unit of resource allocation is not critical, since it is the vector that determines which resource should be adjusted and by how much. The system effectively uses pre-profiling of each resource's response to a particular workload. After a certain period of time, or whenever a given resource allocation reaches its maximum (or minimum), the allocation vector is recalculated directly, as a moving average, or as a smoothed combination of previous vectors.
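One possible form of a "smoothed combination of previous vectors" is exponential smoothing, sketched below. The choice of exponential smoothing, the function name `smooth_vector` and the weight `alpha` are assumptions made for illustration; the description leaves the smoothing method open.

```python
from typing import Dict


def smooth_vector(
    previous: Dict[str, float],
    latest: Dict[str, float],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Blend a newly measured vector with the stored one.

    alpha is a hypothetical weight in [0, 1]: 1.0 keeps only the latest
    measurement (direct recalculation), smaller values favor history.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return {
        # A resource with no history simply takes its latest value.
        resource: alpha * value + (1.0 - alpha) * previous.get(resource, value)
        for resource, value in latest.items()
    }


stored = {"cpu": 0.2, "memory_gb": 1.0}
measured = {"cpu": 0.4, "memory_gb": 1.0}
updated = smooth_vector(stored, measured, alpha=0.5)
```

Setting `alpha` to 1.0 reproduces the direct recalculation case, since the stored vector is then ignored.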

Thus, as discussed, multiple computer system resources are considered when allocating resources to a workload so that the workload can meet its performance criteria. The systems and methods discussed herein make resource allocation policy definition easier by allowing for a single specification for multiple computer system resources, based on an historical response of the workload to changes in each individual resource allocation to the workload.

As discussed, process 40 can run, if desired, in a global controller (not shown) or in one or more of the resource managers. Process 401 collects resource consumption data by extracting data from the system on a resource by resource basis to determine how much of each resource was consumed by the workload in the prior interval. This data can come from the resource managers or from other sources and can be stored in storage (not shown) if desired.

Process 40 makes it possible to adjust multiple resource entitlements simultaneously and have a reasonable likelihood of making appropriate adjustments that will improve the response time of the workload, without wasting resources that are not likely to improve performance.

1. A method of operating a multi-resource, multi-workload computer, said method comprising: gathering data on resource availability; gathering data on workload performance on a per workload basis; and adjusting resource requests for each workload based upon said gathered resource availability and gathered workload performance data.
2. The method of claim 1 wherein said adjusting comprises: arbitrating among resource requests across all workloads.
3. The method of claim 1 wherein said adjusting comprises: selecting the proper amount of a resource to adjust.
4. A method of managing performance in a multi-workload, multi-resource computer system, said method comprising: collecting resource consumption and performance data on each resource available in said computer system, said data collected on an individual workload basis; accepting incoming resource requests for a particular resource for enhancing the performance of a particular workload; and adjusting said resource request based upon said collected resource consumption and performance data.
5. The method of claim 4 wherein said adjusting comprises: selecting the proper resource to adjust.
6. The method of claim 4 wherein said adjusting comprises: selecting the proper amount of a resource to adjust.
7. The method of claim 4 wherein said adjusting further comprises: arbitrating among competing resource requests.
8. A computer system comprising: a plurality of resources available for use by a plurality of workloads; means for collecting resource consumption data pertaining to the utilization of resources across workloads; means for collecting resource performance patterns; means for accepting requests for resource modifications for a particular workload; and means for adjusting any said received request based upon said collected resource consumption data and said collected resource performance patterns.
9. The computer system of claim 8 further comprising: means for arbitrating across workloads for competing resource requests.
10. The computer system of claim 9 further comprising: a controller operating across a plurality of said workloads for controlling said resource adjustments.
11. A computer system comprising: a workload manager for controlling multiple resource allocations to different workloads running on said computer, said manager comprising: memory for storing therein resource consumption data on a resource by resource basis and resource performance data on a resource by resource basis; and control for adjusting resource requests for each workload based upon said stored consumption and performance data.
12. The computer system of claim 11 further comprising: an arbiter for arbitrating adjustments between competing resource requests.
13. The computer system of claim 12 wherein said control comprises: a process for observing performance results by changing one or more of a plurality of available resources on a particular workload; and means for repeating said observing for each available resource.
14. The computer system of claim 13 further comprising: a process for selecting based on said stored profiles which resource should be adjusted at any particular time.
15. The computer system of claim 12 wherein said resource arbitration is across workloads as well as resources.
16. A computer program product having computer readable media, said media comprising: code for controlling the gathering of data on resource availability; code for controlling the gathering of data on resource performance; and code for controlling the adjustment of resource requests based upon said gathered resource availability and gathered resource performance data.
17. The computer program product of claim 16 wherein said code for controlling the adjustment comprises: code for controlling arbitration among resource requests.
18. The computer program product of claim 17 wherein said arbitration is across a plurality of servers.
19. The computer program product of claim 17 wherein said code for controlling the adjustment comprises: code for controlling the selection of the proper resource to adjust.
20. The computer program product of claim 17 wherein said code for controlling the adjustment comprises: code for controlling the selection of the proper amount of a resource to adjust.
21. A method for enhancing computer performance, said method comprising: observing performance results by changing one of a plurality of available resources on a particular workload; repeating said observing for each available resource; and storing the results of said observing as a profile of each resource with respect to said workload, said stored results available for use in adjusting resources with respect to said workload when such adjusting becomes necessary.
22. The method of claim 21 further comprising: selecting based on said stored profiles which resource should be adjusted at any particular time.
23. The method of claim 22 further comprising: repeating said observing, repeating and storing from time to time.
24. A computer program product having computer readable media stored thereon, said computer readable media comprising: code for controlling the determination for each available resource a degree of performance change occasioned by a change in said resource on a particular workload; and code for controlling the selection, based on said determined degree of performance change for each said resource, which resource to be added when said particular workload requires additional performance.
25. The product of claim 24 wherein said computer readable media further comprises: code for controlling the selection of the degree of change in said selected resource.
26. The product of claim 25 wherein said computer readable media further comprises: code for controlling the repetition of said determining to arrive at a revised degree of performance change occasioned by a change in said resource on said particular workload.
27. A computer system comprising: a plurality of resources available for use by workloads running on said computer system; means for determining for each available resource a degree of performance change occasioned by a change in said resource on a particular workload running on said computer system; and means operable based on said determining means for indicating to said workload manager which resource should be added when said particular workload requires additional performance.
28. The system of claim 27 further comprising: means for determining when a particular workload requires additional resources.
29. The system of claim 27 further comprising: means for repeating said determining to arrive at a revised degree of performance change occasioned by a change in said resource on said particular workload.
30. The system of claim 29 further comprising: means for selecting the degree of change in said selected resource.
31. A method for adjusting resources on a computer system, said method comprising: determining that a workload needs performance improvement; determining a proportional scalar to apply to a current resource allocation; and changing the current resource allocation in accordance with said determined proportional scalar.
32. The method of claim 31 wherein said scalar determining comprises: from time to time changing the allocation of a selected resource by a certain amount; and measuring the result of said changing on workload performance.
33. The method of claim 32 further comprising: normalizing said proportional scalar for said selected resource based on said measured result.
34. The method of claim 33 further comprising: removing said changed allocation of said selected resource; changing the allocation of a second selected resource by a certain amount; measuring the result of said changing on workload performance; and normalizing said proportional scalar for said second selected resource based on said measured result.
35. The method of claim 34 wherein said resources are spread among a plurality of partitions.
36. A computer system comprising: a plurality of resources available to process a workload; resource adjustment control for processing requests for resource adjustments to improve workload processing for said workload; and a process for modifying any such processed requests for a particular resource adjustment such that only the resources calculated to deliver the most workload benefit from any such adjustment are modified.
37. The computer system of claim 36 wherein said resource adjustment control comprises: a workload manager for controlling resource assignments to said workload running on said computer; memory for storing therein resource consumption data and resource performance data; and control for adjusting resource requests based upon said stored consumption and performance data.
38. The computer system of claim 37 further comprising: a plurality of workloads sharing said resources.
39. A method for resource allocation in a multi-resource computer system, said method comprising: determining for each available resource a degree of performance change occasioned by a change in said resource on a particular workload; and selecting, based on said determined degree of performance change for each said resource, which resource to be added when said particular workload requires additional performance.
40. The method of claim 39 wherein said selecting further comprises: selecting the degree of change in said selected resource.
41. The method of claim 40 further comprising: repeating said determining to arrive at a revised degree of performance change occasioned by a change in said resource on said particular workload.
42. The method of claim 41 further comprising: storing said performance change information in memory.
43. A computer system comprising: a plurality of resources available for use by workloads running on said computer system; a workload manager for determining for each available resource a degree of performance change occasioned by a change in said resource on a particular workload running on said computer system; and wherein said workload manager is further operable based on said determining for indicating to said workload manager which resource should be adjusted when said particular workload requires additional performance.
44. The computer system of claim 43 wherein said workload manager is further operable for determining when a particular workload requires additional resources.
45. The computer system of claim 43 wherein said workload manager is further operable for repeating said determining to arrive at a revised degree of performance change occasioned by a change in said resource on said particular workload.
46. The computer system of claim 45 wherein said workload manager is further operable for selecting the degree of change in said selected resource.
47. The method for controlling resource adjustments with respect to a workload, said method comprising: collecting performance data with respect to said workload; determining the satisfaction level of said workload at a particular time; demanding an adjustment in a resource for a particular workload; and selecting which resource of a plurality of resources should be adjusted to obtain said workload's performance goals.
48. The method of claim 47 further comprising: determining the magnitude of said selected resource adjustment.
49. The method of claim 48 wherein said determining comprises: observing performance results by changing one of a plurality of resources available to said workload; repeating said observing for each available resource; storing the results of said observing as a profile of each resource with respect to said workload, said stored results available for use in adjusting resources with respect to said workload when such adjusting becomes necessary.
50. The method of claim 49 further comprising: selecting, based on said stored profiles, which resource should be adjusted at any particular time.
51. The method of claim 49 further comprising: repeating said observing, repeating and storing from time to time.
52. A computer program product having computer readable media stored thereon, said computer readable media comprising: code for controlling the collection of performance data with respect to a workload; code for controlling the determination of the satisfaction level of said workload at a particular time; code for controlling an adjustment in a resource for said workload; and code for controlling the selection of the resource from a plurality of resources that could be adjusted to obtain said workload's performance goals.
53. The computer program product of claim 52 further comprising: code for controlling the determination of the magnitude that a selected resource should be adjusted.